Improved Sum-of-Squares Lower Bounds for Hidden Clique and Hidden Submatrix Problems
Authors
Abstract
Given a large data matrix A ∈ R^{n×n}, we consider the problem of determining whether its entries are i.i.d. from some known marginal distribution Aij ∼ P0, or instead A contains a principal submatrix AQ,Q whose entries have marginal distribution Aij ∼ P1 ≠ P0. As a special case, the hidden (or planted) clique problem is finding a planted clique in an otherwise uniformly random graph. Assuming unbounded computational resources, this hypothesis testing problem is statistically solvable provided |Q| ≥ C log n for a suitable constant C. However, despite substantial effort, no polynomial time algorithm is known that succeeds with high probability when |Q| = o(√n). Recently, Meka and Wigderson (2013) proposed a method to establish lower bounds for the hidden clique problem within the Sum of Squares (SOS) semidefinite hierarchy. Here we consider the degree-4 SOS relaxation, and study the construction of Meka and Wigderson (2013) to prove that SOS fails unless k ≥ C n^{1/2}/log n. An argument presented by Barak (2014) implies that this lower bound cannot be substantially improved unless the witness construction in the proof is changed. Our proof uses the moment method to bound the spectrum of a certain random association scheme, i.e. a symmetric random matrix whose rows and columns are indexed by the edges of an Erdős–Rényi random graph.
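As a small illustrative sketch (not from the paper), the null and planted distributions described above can be sampled as follows; the function name, parameters, and seed are assumptions for illustration:

```python
import numpy as np

def sample_instance(n, k=None, seed=0):
    """Sample an n x n symmetric 0/1 adjacency matrix.

    Null (k is None): an Erdos-Renyi graph G(n, 1/2).
    Alternative: the same graph with a clique planted on a
    uniformly random vertex subset Q of size k.
    """
    rng = np.random.default_rng(seed)
    A = rng.integers(0, 2, size=(n, n))
    A = np.triu(A, 1)          # keep upper triangle, zero diagonal
    A = A + A.T                # symmetrize
    Q = None
    if k is not None:
        Q = rng.choice(n, size=k, replace=False)
        A[np.ix_(Q, Q)] = 1    # plant the clique: all edges inside Q
        A[Q, Q] = 0            # no self-loops
    return A, Q

# Under the alternative, Q induces a clique: the principal
# submatrix A[Q, Q] is all ones off the diagonal.
A, Q = sample_instance(200, k=30, seed=1)
sub = A[np.ix_(Q, Q)]
assert (sub + np.eye(30) == 1).all()
```

The statistical threshold |Q| ≥ C log n in the abstract comes from the fact that the largest clique in G(n, 1/2) is of size roughly 2 log2 n, so a substantially larger planted clique is detectable by exhaustive search.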
Related work
Chi-squared Amplification: Identifying Hidden Hubs
We consider the following general hidden hubs model: an n × n random matrix A with a subset S of k special rows (hubs): entries in rows outside S are generated from the (Gaussian) probability distribution p0 ∼ N(0, σ0²); for each row in S, some k of its entries are generated from p1 ∼ N(0, σ1²), σ1 > σ0, and the rest of the entries from p0. The special rows with higher variance entries can be v...
The Hidden Hubs Problem
We introduce the following hidden hubs model H(n, k, σ0, σ1): the input is an n × n random matrix A with a subset S of k special rows (hubs); entries in rows outside S are generated from the Gaussian distribution p0 = N(0, σ0²), while for each row in S, an unknown subset of k of its entries are generated from p1 = N(0, σ1²), σ1 > σ0, and the rest of the entries from p0. The special rows with hi...
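A minimal sketch of sampling from the hidden hubs model H(n, k, σ0, σ1) as defined in the snippet above; the function name and seed are illustrative assumptions, not from the cited papers:

```python
import numpy as np

def sample_hidden_hubs(n, k, sigma0, sigma1, seed=0):
    """Sample A from H(n, k, sigma0, sigma1): rows outside the hub
    set S are i.i.d. N(0, sigma0^2); each hub row has k entries, at
    unknown positions, drawn from N(0, sigma1^2) instead."""
    rng = np.random.default_rng(seed)
    A = rng.normal(0.0, sigma0, size=(n, n))
    S = rng.choice(n, size=k, replace=False)         # hub rows
    for i in S:
        cols = rng.choice(n, size=k, replace=False)  # hidden positions
        A[i, cols] = rng.normal(0.0, sigma1, size=k)
    return A, S

A, S = sample_hidden_hubs(500, 20, 1.0, 3.0, seed=2)
# Hub rows have an inflated empirical second moment, which is the
# signal that chi-squared style tests amplify.
assert (A[S] ** 2).sum(axis=1).mean() > (A ** 2).sum(axis=1).mean()
```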
Statistical Limits of Convex Relaxations
Many high dimensional sparse learning problems are formulated as nonconvex optimization. A popular approach to solve these nonconvex optimization problems is through convex relaxations such as linear and semidefinite programming. In this paper, we study the statistical limits of convex relaxations. Particularly, we consider two problems: Mean estimation for sparse principal submatrix and edge p...
On the Statistical Limits of Convex Relaxations: A Case Study
Many high dimensional sparse learning problems are formulated as nonconvex optimization. A popular approach to solve these nonconvex optimization problems is through convex relaxations such as linear and semidefinite programming. In this paper, we study the statistical limits of convex relaxations. Particularly, we consider two problems: Mean estimation for sparse principal submatrix and edge p...
Lower bounds on the signed (total) $k$-domination number
Let $G$ be a graph with vertex set $V(G)$. For any integer $k\ge 1$, a signed (total) $k$-dominating function is a function $f: V(G) \rightarrow \{-1, 1\}$ satisfying $\sum_{x\in N[v]}f(x)\ge k$ ($\sum_{x\in N(v)}f(x)\ge k$) for every $v\in V(G)$, where $N(v)$ is the neighborhood of $v$ and $N[v]=N(v)\cup\{v\}$. The minimum of the values $\sum_{v\in V(G)}f(v)$, taken over all signed (total) $k$-dominating functi...
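The definition in the snippet above can be checked mechanically; here is a small sketch with the graph given as an adjacency list (the helper name and example are assumptions for illustration):

```python
def is_signed_k_dominating(adj, f, k, total=False):
    """Check whether f: V -> {-1, +1} is a signed (total)
    k-dominating function of the graph with adjacency list adj.

    Uses closed neighborhoods N[v] by default; total=True switches
    to open neighborhoods N(v), matching the parenthesized variant."""
    for v, nbrs in adj.items():
        s = sum(f[u] for u in nbrs)  # sum over N(v)
        if not total:
            s += f[v]                # include v itself: N[v]
        if s < k:
            return False
    return True

# Example on K4: every closed neighborhood contains all four vertices.
adj = {0: [1, 2, 3], 1: [0, 2, 3], 2: [0, 1, 3], 3: [0, 1, 2]}
f = {0: 1, 1: 1, 2: 1, 3: -1}
assert is_signed_k_dominating(adj, f, k=2)   # every N[v] sums to 2
assert not is_signed_k_dominating(adj, f, k=2, total=True)  # N(0) sums to 1
```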
Publication date: 2015